

Search for: All records

Creators/Authors contains: "Mouli, S"


  1. Deep learning models tend not to be robust out of distribution, primarily because they rely on spurious features to solve the task. Counterfactual data augmentations provide a general way of (approximately) achieving representations that are counterfactual-invariant to spurious features, a requirement for out-of-distribution (OOD) robustness. In this work, we show that counterfactual data augmentations may not achieve the desired counterfactual-invariance if the augmentation is performed by a context-guessing machine, an abstract machine that guesses the most likely context of a given input. We theoretically analyze the invariance imposed by such counterfactual data augmentations and describe an exemplar NLP task where counterfactual data augmentation by a context-guessing machine does not lead to robust OOD classifiers.
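     To make the failure mode concrete, the toy sketch below (a hypothetical illustration in Python, not the paper's construction; the word lists and function names are invented) shows counterfactual augmentation driven by a context-guessing machine: the machine guesses the input's context from surface cues and rewrites the input into the other context, so whenever the guess is wrong, the augmented pair enforces invariance to the wrong variable.

        import random

        CONTEXTS = ["movie", "book"]   # candidate spurious contexts

        def guess_context(sentence):
            # Context-guessing machine: returns the most likely context of
            # the input, guessing outright when the cues are ambiguous.
            words = set(sentence.split())
            if words & {"film", "movie", "screen"}:
                return "movie"
            if words & {"novel", "book", "page"}:
                return "book"
            return random.choice(CONTEXTS)   # this guess can simply be wrong

        def counterfactual_augment(sentence):
            # Rewrite the sentence into the other context, given the guess.
            src = guess_context(sentence)
            swaps = {"movie": {"film": "novel", "movie": "book"},
                     "book":  {"novel": "film", "book": "movie"}}[src]
            return " ".join(swaps.get(w, w) for w in sentence.split())

        # Training on the pair (x, counterfactual_augment(x)) with a shared
        # label only approximates counterfactual-invariance, and fails to
        # impose it whenever guess_context() misidentifies the context.
        print(counterfactual_augment("a touching film with a believable story"))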
  2. Generalizing from observed to new related environments (out-of-distribution) is central to the reliability of classifiers. However, most classifiers fail to predict label Y from input X when the change in environment is due to a (stochastic) input transformation T_te(X') not observed in training, as in training we observe T_tr(X'), where X' is a hidden variable. This work argues that when the transformations in train and test are (arbitrary) symmetry transformations induced by a collection of known equivalence relations, the task of finding a robust OOD classifier can be defined as finding the simplest causal model that defines a causal connection between the target labels and the symmetry transformations that are associated with label changes. We then propose a new learning paradigm, asymmetry learning, that identifies which symmetries the classifier must break in order to correctly predict in both train and test. Asymmetry learning performs a causal model search that, under certain identifiability conditions, finds classifiers that perform equally well in-distribution and out-of-distribution. Finally, we show how to learn counterfactually-invariant representations with asymmetry learning in two physics tasks.
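     As a rough illustration of the symmetry search (a hypothetical Python simplification, not the paper's causal model search; all names are invented), the sketch below starts from invariance to every known candidate symmetry and breaks a symmetry exactly when its invariant predictor cannot fit the training data.

        import numpy as np

        def fits_training_data(predict, X, y, tol=0.05):
            # A candidate model is consistent if its training error is small.
            return np.mean(predict(X) != y) <= tol

        def asymmetry_search(candidates, make_invariant_predictor, X, y):
            # Prefer the simplest model: keep every symmetry whose invariant
            # predictor explains the data; the classifier must break the rest.
            kept = []
            for name, transform in candidates:
                if fits_training_data(make_invariant_predictor(transform), X, y):
                    kept.append(name)
            return kept

        rng = np.random.default_rng(0)
        X = rng.normal(size=(200, 1))
        y = (X[:, 0] > 0).astype(int)      # the label depends on the sign of x

        def make_invariant_predictor(transform):
            # Invariant by construction: average the input over the two-element
            # orbit {x, transform(x)} before thresholding.
            def predict(X):
                return (((X + transform(X)) / 2)[:, 0] > 0).astype(int)
            return predict

        candidates = [("sign-flip", lambda X: -X), ("identity", lambda X: X)]
        print(asymmetry_search(candidates, make_invariant_predictor, X, y))
        # -> ['identity']: the sign-flip symmetry must be broken to predict y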
  3. Despite, or maybe because of, their astonishing capacity to fit data, neural networks are believed to have difficulties extrapolating beyond the training data distribution. This work shows that, for extrapolations based on finite transformation groups, a model's inability to extrapolate is unrelated to its capacity. Rather, the shortcoming is inherited from a learning hypothesis: examples not explicitly observed (even with infinitely many training examples) have underspecified outcomes in the learner's model. To endow neural networks with the ability to extrapolate over group transformations, we introduce a learning framework counterfactually guided by the learning hypothesis that invariance to any (known) transformation group is mandatory even without evidence, unless the learner deems it inconsistent with the training data. Unlike existing invariance-driven methods for (counterfactual) extrapolations, this framework allows extrapolations from a single environment. Finally, we introduce sequence and image extrapolation tasks that validate our framework and showcase the shortcomings of traditional approaches.
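     The default-invariance hypothesis can be pictured with a small sketch (hypothetical Python, not the paper's training objective): for a known finite group such as cyclic shifts, penalize any variation of the model's output over an input's orbit, and keep that penalty unless the training data contradicts the invariance.

        import numpy as np

        def cyclic_shifts(x):
            # Orbit of a sequence under the finite group of cyclic shifts.
            return [np.roll(x, k) for k in range(len(x))]

        def invariance_penalty(model, x):
            # Variance of the outputs over the orbit; zero iff the model is
            # invariant on this orbit.
            return np.array([model(z) for z in cyclic_shifts(x)]).var()

        # Training would minimize task_loss + lam * invariance_penalty,
        # dropping the penalty only for groups the training data rules out.
        shift_invariant = lambda x: x.sum()    # invariant: penalty is 0.0
        position_based = lambda x: x[0]        # not invariant: penalty is 2.0
        print(invariance_penalty(shift_invariant, np.arange(5)))
        print(invariance_penalty(position_based, np.arange(5)))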
  4. Roll-to-roll printing has significantly shortened the time from design to production of sensors and IoT devices, while remaining cost-effective for mass production. However, because fewer manufacturing tolerance controls are available, properties such as sensor thickness, composition, and roughness cannot be precisely controlled. Since these properties likely affect the sensor behavior, roll-to-roll printed sensors require validation testing before they can be deployed in the field. In this work, we improve the testing of nitrate sensors, which need to be calibrated in a solution of known nitrate concentration for around 1–2 days. To accelerate this process, we observe the initial behavior of the sensors for a few hours and use a physics-informed machine learning method to predict their measurements 24 hours into the future, thus saving valuable time and testing resources. Due to the variability in roll-to-roll printing, this prediction task requires models that are robust to changes in the properties of new test sensors. We show that existing methods fail at this task, and we describe a physics-informed machine learning method that improves the robustness of the predictions across testing conditions (≈1.7× lower prediction error on real-world data and ≈5× lower on synthetic data compared with the current state-of-the-art physics-informed machine learning method).
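     The acceleration step can be pictured with a toy sketch (the settling model and all numbers below are hypothetical, not the paper's physics model): fit a physics-motivated stabilization curve to the first few hours of a sensor's readings, then evaluate it 24 hours ahead.

        import numpy as np
        from scipy.optimize import curve_fit

        def stabilization(t, v_inf, dv, tau):
            # Hypothetical first-order settling: v(t) = v_inf + dv*exp(-t/tau).
            return v_inf + dv * np.exp(-t / tau)

        # First 4 hours of (noisy) readings from a newly printed sensor.
        t_obs = np.linspace(0, 4, 25)
        v_obs = stabilization(t_obs, v_inf=210.0, dv=40.0, tau=3.0)
        v_obs += np.random.default_rng(1).normal(0.0, 0.5, t_obs.shape)

        # Fit the physics model to the observed window, extrapolate to 24 h.
        params, _ = curve_fit(stabilization, t_obs, v_obs, p0=(200.0, 30.0, 1.0))
        print("predicted reading at t = 24 h:", stabilization(24.0, *params))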
  5. Despite, or maybe because of, their astonishing capacity to fit data, neural networks are believed to have difficulties extrapolating beyond the training data distribution. This work shows that, for extrapolations based on finite transformation groups, a model's inability to extrapolate is unrelated to its capacity. Rather, the shortcoming is inherited from a learning hypothesis: examples not explicitly observed (even with infinitely many training examples) have underspecified outcomes in the learner's model. To endow neural networks with the ability to extrapolate over group transformations, we introduce a learning framework counterfactually guided by the learning hypothesis that invariance to any (known) transformation group is mandatory even without evidence, unless the learner deems it inconsistent with the training data.
  6. Over-sharing poorly worded thoughts and personal information is prevalent on online social platforms. In many of these cases, users regret posting such content. To retrospectively rectify these errors in users' sharing decisions, most platforms offer (deletion) mechanisms to withdraw the content, and social media users often utilize them. Ironically, and perhaps unfortunately, these deletions make users more susceptible to privacy violations by malicious actors who specifically hunt post deletions at large scale. The reason for such hunting is simple: deleting a post acts as a powerful signal that the post might be damaging to its owner. Today, multiple archival services are already scanning social media for these deleted posts. Moreover, as we demonstrate in this work, powerful machine learning models can detect damaging deletions at scale. Towards restraining such a global adversary against users' right to be forgotten, we introduce Deceptive Deletion, a decoy mechanism that minimizes the adversarial advantage. Our mechanism injects decoy deletions, creating a two-player minimax game between an adversary that seeks to identify damaging content among the deleted posts and a challenger that employs decoy deletions to disguise the real damaging deletions. We formalize the Deceptive Game between the two players, determine conditions under which either the adversary or the challenger provably wins the game, and discuss the scenarios in between these two extremes. We apply the Deceptive Deletion mechanism to a real-world task on Twitter: hiding damaging tweet deletions. We show that a powerful global adversary can be beaten by a powerful challenger, raising the bar significantly and offering a glimmer of hope for the ability to be truly forgotten on social platforms.
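     The core of the game can be made concrete with a toy sketch (hypothetical numbers and a stand-in adversary, not the paper's formal model): injecting decoy deletions of benign posts dilutes the precision of an adversary that flags deletions as damaging.

        import random

        random.seed(0)

        def adversary_flags(post):
            # Stand-in adversary: flags a deletion as damaging with some skill.
            return random.random() < (0.9 if post["damaging"] else 0.2)

        def adversary_precision(deletions):
            # Fraction of flagged deletions that are actually damaging.
            flagged = [p for p in deletions if adversary_flags(p)]
            return sum(p["damaging"] for p in flagged) / max(len(flagged), 1)

        real = [{"damaging": True}] * 50 + [{"damaging": False}] * 50
        decoys = [{"damaging": False}] * 400   # challenger's decoy deletions

        print("precision without decoys:", adversary_precision(real))
        print("precision with decoys:  ", adversary_precision(real + decoys))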
  7. A major open problem in proof complexity is to prove superpolynomial lower bounds for AC0[p]-Frege proofs. This system is the proof-theoretic analog of AC0[p], the class of bounded-depth circuits with prime modular counting gates. Despite strong lower bounds for this circuit class dating back thirty years ([28, 30]), there are no significant lower bounds for AC0[p]-Frege. Significant and extensive degree lower bounds have been obtained for a variety of subsystems of AC0[p]-Frege, including Nullstellensatz ([3]), Polynomial Calculus ([9]), and SOS ([14]). However, to date there has been no progress on AC0[p]-Frege lower bounds. In this paper we study constant-depth extensions of the Polynomial Calculus [13]. We show that these extensions are much more powerful than was previously known. Our main result is that small-depth (≤ 43) Polynomial Calculus (over a sufficiently large field) can polynomially effectively simulate all of the well-studied semialgebraic proof systems: Cutting Planes, Sherali-Adams, Sum-of-Squares (SOS), and Positivstellensatz Calculus (Dynamic SOS). Additionally, these extensions can quasi-polynomially effectively simulate AC0[q]-Frege for any prime q independent of the characteristic of the underlying field, and they can effectively simulate TC0-Frege if the depth is allowed to grow proportionally. Thus, proving strong lower bounds for constant-depth extensions of the Polynomial Calculus would give lower bounds not only for AC0[p]-Frege but also for systems as strong as TC0-Frege.
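     For reference, the abstract's main simulation claims can be restated compactly in informal theorem form (a LaTeX paraphrase of the statements above, assuming an amsthm theorem environment; it adds nothing beyond the abstract):

        \begin{theorem}[informal; see abstract]
        Let $\mathbb{F}$ be a sufficiently large field. Polynomial Calculus over
        $\mathbb{F}$ of small constant depth ($\leq 43$) polynomially effectively
        simulates Cutting Planes, Sherali--Adams, Sum-of-Squares, and
        Positivstellensatz Calculus; it quasi-polynomially effectively simulates
        $\mathrm{AC}^0[q]$-Frege for any prime $q$ independent of
        $\mathrm{char}(\mathbb{F})$; and it effectively simulates
        $\mathrm{TC}^0$-Frege when the depth may grow proportionally.
        \end{theorem}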